# this loads all the rsa functions, rewritten for ease of reading
source("setup.r")
## >> these packages are now attached:
## lefftpack reshape2 ggplot2 magrittr dplyr
## >> custom `ggplot2::` plot theme is now set
Participants were presented with a target scalar item and a star rating (1–5 stars) and asked to judge the compatibility of the scalar item and star rating. Compatibility was assessed through a binary “yes/no” response to a question of the form, “Do you think that the person thought the food was ____ ?” where a target scalar was presented in the blank. Each participant saw all scalar item and star rating combinations for their particular condition, in a random order.
The two alternatives condition only included entailment pairs taken from van Tiel et al. (2014) for a total of 50 trials for each participant. The four alternatives condition included entailment pairs plus the top two alternatives generated for each scalar family by participants in Experiment 1 (100 trials per participant). The five alternatives condition included the four previous items plus one more neutral item chosen from alternatives generated in Experiment 1 (125 trials per participant). [SEE FIGURE 3, LEFT]
knitr::include_graphics("input/pf16_fig3_left.png")
Literal listener judgments from Experiment 2. The vertical axis shows the proportion of participants indicating compatibility (answering “yes”); the horizontal axis shows the number of stars against which the utterance was judged. Rows are grouped by scale, and items within rows are ordered by valence. Color indicates condition; the conditions differ in how many alternatives they include.
knitr::include_graphics("input/pf16_fig4.png")
Participants were presented with a one-sentence prompt such as “Someone said they thought the food was ____ .” with a target scalar item in the blank. Participants were then asked to generate a star rating representing the rating they thought the reviewer likely gave. Each participant was presented with all scalar items in a random order. Participants in the two alternatives condition gave a total of 10 pragmatic judgments. Participants in the four alternatives condition gave a total of 20 pragmatic judgments. [SEE FIGURE 3, RIGHT]
knitr::include_graphics("input/pf16_fig3_right.png")
Pragmatic listener judgments from Experiment 3. Vertical axis shows proportion of participants generating a star rating. Horizontal axis shows number of stars on which the utterance was judged. Line type denotes condition and colors indicate the particular scalar items. Each panel shows one entailment scalar pair.
knitr::include_graphics("input/pf16_fig5.png")
Using literal listener data from Experiment 2, we conducted a set of simulations with the RSA model. Each simulation kept the model constant, fitting the choice parameter α as a free parameter, but used a set of alternatives to specify the scale over which predictions were computed. We considered four different alternative sets, with empirical measurements corresponding to those shown in Figure 1: 1) the two alternatives in the classic entailment scales, 2) those two alternatives with the addition of a generic negative alternative, 3) the expanded set of four alternatives, and 4) the expanded set of five alternatives. Literal semantics for the generic negative alternative served as a baseline “none” semantics in which the scalar item was only compatible with 1 star. [SEE FIGURE 1]
knitr::include_graphics("input/pf16_fig1.png")
Model fit with human judgments was significantly improved by the inclusion of alternatives beyond the entailment items (Table 1). The two alternatives model contained only entailment items, which, under classic accounts, should be sufficient to generate implicature, but fit to data was quite poor with these items. The addition of a generic negative element produced some gains, but much higher performance was found when we included four and five alternatives, with the alternatives derived empirically for the specific scale we used. [SEE FIGURE 6]
knitr::include_graphics("input/pf16_fig6.png")
knitr::include_graphics("input/pf16_table1.png")
Below is a listing of the relevant contents of the `rrrsa` package.
# FIVE DATASETS
rsa_data <- list(
# from frank et al submitted:
frank = rrrsa::d_pragmods,
# from pelfrank16:
pf2 = rrrsa::peloquinFrank_2Alts,
pf3 = rrrsa::peloquinFrank_3Alts,
pf4 = rrrsa::peloquinFrank_4Alts,
pf5 = rrrsa::peloquinFrank_5Alts
)
# THREE HELPER FUNCTIONS
rsa_helper <- list(
# converts vec2 to whatever type vec1 is
convertVecType = rrrsa::rsa.convertVecType,
# this is just `function(x) x/sum(x)` for positive `x`
normVec = rrrsa::rsa.normVec,
# just changes desired column names
renameCol = rrrsa::rsa.renameCol
)
# SIX ANALYSIS FUNCTIONS
rsa_model <- list(
# computes informativity given params `m_u`, `alpha`, `cost`
informativity = rrrsa::rsa.informativity,
# computes utility given params `items`, `costs`, `alpha`
utility = rrrsa::rsa.utility,
# wrapper that runs `depth`-many iterations of `rsa.fullRecursion()`
reason = rrrsa::rsa.reason,
# wrapper around `rsa.reason` that accepts input as a data frame
runDf = rrrsa::rsa.runDf,
# explores correlation between data and model preds for varying `alpha`
tuneDepthAlpha = rrrsa::rsa.tuneDepthAlpha,
# main function implementing RSA model
fullRecursion = rrrsa::rsa.fullRecursion
)
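For orientation, the update these six functions jointly implement is the standard RSA recursion: a literal listener, a soft-max speaker, and a Bayesian pragmatic listener. Below is a minimal sketch of that recursion in base R; the function name `rsa_step` and its normalization details are my own illustration, not `rrrsa`'s internals.

```r
# Minimal sketch of one RSA step (illustrative; not rrrsa internals).
# m: semantics matrix, rows = meanings, cols = words.
rsa_step <- function(m, costs = rep(0, ncol(m)),
                     priors = rep(1, nrow(m)), alpha = 1) {
  norm_cols <- function(x) {
    apply(x, 2, function(v) if (sum(v) > 0) v / sum(v) else v)
  }
  lit <- norm_cols(m)  # literal listener: P(meaning | word)
  # speaker: soft-max over words of alpha * (log informativity - cost)
  util <- alpha * (log(lit) - rep(costs, each = nrow(m)))
  spk <- exp(util)
  spk <- t(apply(spk, 1, function(v) if (sum(v) > 0) v / sum(v) else v))
  norm_cols(spk * priors)  # pragmatic listener: Bayes, then renormalize
}

m <- matrix(c(1, 0, 0,  0, 1, 1,  0, 0, 1), nrow = 3,
            dimnames = list(1:3, c("none", "some", "all")))
round(rsa_step(m), 2)
# "some" shifts probability toward meaning 2 (the scalar implicature):
# its column becomes (0, 0.75, 0.25)
```

With uniform priors and zero costs, the "some" column moves mass off the meaning that "all" covers better, which is the qualitative pattern the fitted models below reproduce.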
semantics_matrix <- matrix(
nrow=5, ncol=3, dimnames=list(paste0("row", 1:5), c("none","some","all")),
data=c(1.0, .00, .00,
.00, .00, .00,
.25, .25, .25,
.25, .00, .00,
.00, .00, 1)
)
# semantics_df <- rsa_matrix_to_df(semantics_matrix,
# response_colname="semantics")
### NEED TO MAKE SURE THESE MATCH THE SHAPE OF THE SEMANTICS MATRIX
# m/dat_pre ~~> matrix of semantics (rows=meaning (m rows), cols=words (n cols))
# costs ~~> ncol(m) vector of costs (default is 0 valued vector)
# priors ~~> nrow(m) vector of priors (default is uniform)
# alpha ~~> decision hyper-param
costs <- c(none=0, some=0, all=0)
priors <- rnorm(nrow(semantics_matrix), mean=.5, sd=.1)
alpha <- 1
print(costs)
## none some all
## 0 0 0
print(priors)
## [1] 0.6510629 0.5675926 0.4438796 0.6840819 0.5763901
# equivalent to:
# rsa.fullRecursion(m=semantics_matrix, costs=costs, priors=priors, alpha=alpha)
pragmatics_matrix <- fullRecursion(
m=semantics_matrix,
costs=costs, priors=priors, alpha=alpha
)
# pragmatics_df <- rsa_matrix_to_df(semantics_matrix,
# response_colname="pragmatics")
knitr::kable(semantics_matrix, row.names=TRUE)
| | none | some | all |
|---|---|---|---|
| row1 | 1 | 0.00 | 0 |
| row2 | 0 | 0.25 | 0 |
| row3 | 0 | 0.25 | 0 |
| row4 | 0 | 0.25 | 0 |
| row5 | 0 | 0.25 | 1 |
knitr::kable(pragmatics_matrix, row.names=TRUE)
| | none | some | all |
|---|---|---|---|
| row1 | 1 | 0.0000000 | 0 |
| row2 | 0 | 0.3134430 | 0 |
| row3 | 0 | 0.2451247 | 0 |
| row4 | 0 | 0.3777721 | 0 |
| row5 | 0 | 0.0636603 | 1 |
make_rsa_plot(semantics_matrix, pragmatics_matrix)
# each dataset from the paper has these columns:
nams <- c("exp", "scale", "stars", "speaker.p", "words", "e11", "e6")
# column descriptions from documentation:
#
# 1) quantityVarName :: "stars"
# 2) semanticsVarName :: "speaker.p"
# 3) itemVarName :: "words"
# 4) groupVarName :: "scale"
# other) experiment number :: "exp"
# other) Pragmatic judgments study 1 (e6), study 2 (e11)
# dimensions and colnames of each dataset
sapply(rsa_data, dim)
## frank pf2 pf3 pf4 pf5
## [1,] 135 50 75 100 125
## [2,] 16 7 7 7 7
sapply(rsa_data, function(df){ identical(names(df), nams) })
## frank pf2 pf3 pf4 pf5
## FALSE TRUE TRUE TRUE TRUE
# just want to consider the data from the peloquin & frank paper
pf_data <- rsa_data[names(rsa_data)[names(rsa_data) != "frank"]]
# combine all of the PF16 datasets into a single df
pf_data <- lapply(seq_along(pf_data), function(idx){
pf_data[[names(pf_data)[idx]]] %>%
mutate(dataset = names(pf_data)[idx]) %>%
mutate_if(is.factor, as.character) %>%
rename(study1=e6, study2=e11)
}) %>%
(function(df_list) do.call(rbind, df_list))
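An aside: the same combine-and-label step can be done more directly with `dplyr::bind_rows()`, whose `.id` argument turns the list names into a column. A toy sketch (the two-element list here is a stand-in for `pf_data`):

```r
library(dplyr)

# Stand-in for the named list of PF16 data frames
df_list <- list(pf2 = data.frame(x = 1:2), pf4 = data.frame(x = 3:4))

combined <- bind_rows(df_list, .id = "dataset")
combined$dataset
# "pf2" "pf2" "pf4" "pf4"
```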
# and look at its structure
str(pf_data)
## 'data.frame': 350 obs. of 8 variables:
## $ exp : chr "e8" "e8" "e8" "e8" ...
## $ scale : chr "good_excellent" "good_excellent" "good_excellent" "good_excellent" ...
## $ stars : int 1 2 3 4 5 1 2 3 4 5 ...
## $ speaker.p: num 0 0.0345 0.069 0.4138 0.4828 ...
## $ words : chr "excellent" "excellent" "excellent" "excellent" ...
## $ study2 : num 0 0 0 0.302 0.698 ...
## $ study1 : num 0 0 0 0.0732 0.9268 ...
## $ dataset : chr "pf2" "pf2" "pf2" "pf2" ...
# each row is a unique combo of scale, stars, words, dataset
pf_data %>% group_by(scale, stars, words, dataset) %>% count() %$% table(n)
## n
## 1
## 350
NOTE: these are by-item averages for Experiment 2 (the trial counts for the two-, four-, and five-alternatives conditions match the description on p. 321).
d <- pf_data %>%
filter(dataset %in% c("pf2", "pf4", "pf5"))
knitr::kable(d %>% group_by(dataset) %>% summarize(rows = n()))
| dataset | rows |
|---|---|
| pf2 | 50 |
| pf4 | 100 |
| pf5 | 125 |
Here are the words used for each scale in Expt2:
d %>% group_by(dataset, scale) %>% summarize(
rows = n(),
# use "," as the separator so the markdown table renders correctly
words = paste(sort(unique(words)), collapse=", ")
) %>% knitr::kable()
| dataset | scale | rows | words |
|---|---|---|---|
| pf2 | good_excellent | 10 | excellent, good |
| pf2 | liked_loved | 10 | liked, loved |
| pf2 | memorable_unforgettable | 10 | memorable, unforgettable |
| pf2 | palatable_delicious | 10 | delicious, palatable |
| pf2 | some_all | 10 | all, some |
| pf4 | good_excellent | 20 | bad, excellent, good, horrible |
| pf4 | liked_loved | 20 | disliked, hated, liked, loved |
| pf4 | memorable_unforgettable | 20 | bland, forgettable, memorable, unforgettable |
| pf4 | palatable_delicious | 20 | delicious, disgusting, gross, palatable |
| pf4 | some_all | 20 | all, most, none, some |
| pf5 | good_excellent | 25 | bad, excellent, good, horrible, okay |
| pf5 | liked_loved | 25 | disliked, hated, indifferent, liked, loved |
| pf5 | memorable_unforgettable | 25 | bland, forgettable, memorable, ordinary, unforgettable |
| pf5 | palatable_delicious | 25 | delicious, disgusting, gross, mediocre, palatable |
| pf5 | some_all | 25 | all, little, most, none, some |
It is unclear which column is plotted in Figure 4, so we try all three and inspect each result.
d %>%
ggplot(aes(x=stars, y=speaker.p, color=scale, linetype=dataset)) +
geom_line() +
facet_wrap(~words) +
labs(title="fig4(?) -- plotting column d$speaker.p")
d %>%
ggplot(aes(x=stars, y=study1, color=scale, linetype=dataset)) +
geom_line() +
facet_wrap(~words) +
labs(title="fig4(?) -- plotting column d$study1")
d %>%
ggplot(aes(x=stars, y=study2, color=scale, linetype=dataset)) +
geom_line() +
facet_wrap(~words) +
labs(title="fig4(?) -- plotting column d$study2")
lo_words <- c("good","liked","memorable","palatable","some")
hi_words <- c("excellent","loved","unforgettable","delicious","all")
d <- pf_data %>%
filter(dataset %in% c("pf2", "pf4")) %>%
filter(words %in% c(lo_words, hi_words))
table(d$dataset) # color is 4alts, 2alts
##
## pf2 pf4
## 50 50
table(d$scale) # facet by scale
##
## good_excellent liked_loved memorable_unforgettable
## 20 20 20
## palatable_delicious some_all
## 20 20
table(d$words)
##
## all delicious excellent good liked
## 10 10 10 10 10
## loved memorable palatable some unforgettable
## 10 10 10 10 10
d$scale_pos <- case_when(
d$words %in% lo_words ~ "lo",
d$words %in% hi_words ~ "hi"
)
# two rows per combo, from `$exp` column (e8 and e10)
d_summary <- d %>% group_by(scale, scale_pos, stars) %>% summarize(
rows = n(),
mean_speaker_p = mean(speaker.p),
mean_study1 = mean(study1),
mean_study2 = mean(study2)
) %>% ungroup()
d_summary %>%
# also include "mean_speaker_p" to show model predictions(????)
melt(measure.vars=c("mean_study1", "mean_study2")) %>%
ggplot(aes(x=stars, y=value, color=variable)) +
# geom_point() +
geom_line(aes(linetype=scale_pos)) +
facet_wrap(~scale, nrow=1) +
labs(title="reproduction of figure 5 from PF16")
d_summary %>% ggplot(aes(x=stars, y=mean_study1, color=scale_pos)) +
geom_point() +
geom_line() +
facet_wrap(~scale)
# carve out the <none, some, all> scale from three-alts(??)
# pf_data %>% filter(scale=="some_all") %>% View()
pf3_NSA <- pf_data %>%
filter(dataset=="pf3") %>%
filter(scale=="some_all") %>%
select(-exp, -scale, -dataset) %>%
select(stars, words, study1, study2, speaker.p) %>%
arrange(stars, words)
# all response fields are 0 for none, across the whole scale...
# pf3_NSA %>% filter(words=="none") %>% View() # ***
pf3_NSA_study1 <- pf3_NSA %>%
select(stars, words, value=study1) %>%
tidyr::spread(words, value) %>%
set_rownames(.$stars) %>% # `stars` now encoded as rownames
select(none, some, all) %>%
as.matrix()
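`tidyr::spread()` still works but is superseded; in current tidyr the same long-to-wide reshape is `pivot_wider()`. A toy sketch with the same column roles as `pf3_NSA_study1` (the values here are made up):

```r
library(tidyr)

toy <- data.frame(
  stars = rep(1:2, each = 3),
  words = rep(c("none", "some", "all"), times = 2),
  value = c(1, 0, 0, 0, 0.5, 0.5)
)
wide <- pivot_wider(toy, names_from = words, values_from = value)
wide
# a 2-row table with columns stars, none, some, all
```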
# this is the semantic representation (prob dist) of each scalar item
input_matrix <- pf3_NSA_study1
# need to make sure length matches ncol
costs <- c(none=0, some=0, all=0)
# need to make sure length matches nrow
priors <- rnorm(nrow(input_matrix), mean=.5, sd=.1)
# alpha: the speaker's soft-max decision (rationality) parameter
alpha <- 1
# calculate the posterior distributions for each scalar item
output_matrix <- fullRecursion(
input_matrix, costs=costs, priors=priors, alpha=alpha
)
knitr::kable(input_matrix, row.names=TRUE)
| | none | some | all |
|---|---|---|---|
| 1 | 0 | 0.0000000 | 0.0000000 |
| 2 | 0 | 0.1463415 | 0.0000000 |
| 3 | 0 | 0.8536585 | 0.0731707 |
| 4 | 0 | 0.0000000 | 0.4146341 |
| 5 | 0 | 0.0000000 | 0.5121951 |
knitr::kable(output_matrix, row.names=TRUE)
| | none | some | all |
|---|---|---|---|
| 1 | 0 | 0.0000000 | 0.0000000 |
| 2 | 0 | 0.4612364 | 0.0000000 |
| 3 | 0 | 0.5387636 | 0.0590709 |
| 4 | 0 | 0.0000000 | 0.3141950 |
| 5 | 0 | 0.0000000 | 0.6267341 |
make_rsa_plot(input_matrix, output_matrix)
### STEPS FOR DOING ONE FULL ITERATION OF RSA MODEL
#
# 1. carve out a subset of the main data `pf_data`
# ...
#
# 2. get it into matrix format -- `input_matrix`
# ...
#
# 3. specify costs, priors, and alpha
# ...
#
# 4. call `fullRecursion()` on the matrix, with params -- `output_matrix`
# ...
#
# 5. visualize result
# make_rsa_plot(input_matrix, output_matrix)
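The five steps above can be bundled into one helper. The sketch below is my own convenience wrapper (the name `rsa_pipeline` and its signature are not part of rrrsa or setup.r); it takes the model step as an argument, so it works with either `fullRecursion` or `reason`.

```r
library(dplyr)
library(tidyr)

# Steps 1-4 in one call; `recurse` is the model step, e.g. fullRecursion.
# (Illustrative helper; names are mine, not the rrrsa API.)
rsa_pipeline <- function(df, response, recurse,
                         costs = NULL, priors = NULL, alpha = 1) {
  m <- df %>%
    select(stars, words, value = all_of(response)) %>%
    pivot_wider(names_from = words, values_from = value) %>%
    as.data.frame()
  rownames(m) <- m$stars
  m <- as.matrix(m[, setdiff(colnames(m), "stars"), drop = FALSE])
  if (is.null(costs))  costs  <- rep(0, ncol(m))
  if (is.null(priors)) priors <- rep(1 / nrow(m), nrow(m))
  list(input  = m,
       output = recurse(m, costs = costs, priors = priors, alpha = alpha))
}

# usage with the document's functions (step 5 is the plot):
# res <- rsa_pipeline(pf3_NSA, "study1", recurse = fullRecursion)
# make_rsa_plot(res$input, res$output)
```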
### STEPS FOR DOING MORE THAN ONE ITERATION
# prep the data to an `input_matrix`,
# specify parameters,
# and then call `reason()` on it
# HINT: TRY SETTING THE COSTS VERY HIGH AND ALPHA VERY LOW
semantics_matrix <- matrix(
nrow=5, ncol=3, dimnames=list(paste0("row", 1:5), c("none","some","all")),
data=c(1.0, .00, .00,
.00, .00, .00,
.25, .25, .25,
.25, .00, .00,
.00, .00, 1)
)
costs <- rep(1, ncol(semantics_matrix))
priors <- rep(.5, nrow(semantics_matrix))
alpha <- 1
depth <- 1
recycle_priors <- TRUE
# same as `rsa.reason()` but source code is cleaner
pragmatics_matrix <- reason(
semantics_matrix,
costs=costs, priors=priors, alpha=alpha, depth=depth,
recycle_priors=recycle_priors
)
make_rsa_plot(semantics_matrix, pragmatics_matrix)
# same as `rsa.fullRecursion()` but source code is cleaner
# also same as `reason` but with depth=1
fullRecursion(
input_matrix,
costs=costs, priors=priors, alpha=alpha
)
## none some all
## 1 0 0.0000000 0.00000000
## 2 0 0.5205479 0.00000000
## 3 0 0.4794521 0.03797468
## 4 0 0.0000000 0.48101266
## 5 0 0.0000000 0.48101266
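The relationship between `reason()` and `fullRecursion()` noted in the comments above can be stated in code: depth-`k` reasoning is `k` chained single recursions. The sketch below is illustrative; it takes the single step as an argument rather than assuming anything about setup.r's internals, and it glosses over what `recycle_priors` does at each level.

```r
# Illustrative: reason(m, ..., depth = k) as k chained single steps.
iterate_rsa <- function(m, depth, step, ...) {
  for (i in seq_len(depth)) m <- step(m, ...)
  m
}

# e.g. iterate_rsa(semantics_matrix, depth = 2, step = fullRecursion,
#                  costs = costs, priors = priors, alpha = alpha)
```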
The following equations are from Frank & Goodman’s (2012) model specification.
\[P(r_s|w, C) = \frac{P(w|r_s, C)\times P(r_s)}{\sum\limits_{r'\in C} P(w|r', C) \times P(r')}\]
where
\[P(w|r_s, C) = \frac{|w|^{-1}}{\sum\limits_{w'\in W} |w'|^{-1}}\]
where

- \(|w|\) is the number of objects in \(C\) to which the word \(w\) could apply, and
- \(W\) is the set of words that apply to the intended referent \(r_s\).
\[P(w|r_s, C) \propto e^{\alpha\times U(w; r_s, C)}\]
where
\[U(w; r_s, C) = I(w; r_s, C) - D(w)\]
where
For a sample \(x\) from a known distribution \(p\), the surprisal of \(x\) is \[I_p(x) = -\log p(x)\]
utility decreases with surprisal:
\[I(w; r_s, C) = -I_{\widetilde{w}_C}(r_s)\]
where \(\widetilde{w}_C\) is the distribution over objects that would come from a literal interpretation of \(w\) in context \(C\).
“if listeners interpret the utterance \(w\) literally, assigning zero probability to objects for which the word is false, they assign equal probability to each object consistent with \(w\). This distribution over objects can be written:”
\[\widetilde{w}_C(o) = \begin{cases} \frac{1}{|w|} & \text{if } w(o) = true\\\\ 0 & \text{otherwise } \end{cases}\]
Equation S4 follows from S1–S3, and is equivalent to Equation 2 (a.k.a. “the size principle”):
\[P(w|r_s, C) = \frac {e^{-(-\log(|w|^{-1}))}} {\sum\limits_{w'\in V\ \text{s.t.}\ w'(r_s) = \text{true}} e^{-(-\log(|w'|^{-1}))}}\]
“Thus in our experiments, the speaker’s abstract goal of being informative reduces to a simple formulation: choose a word that applies to the referent and picks out a relatively smaller section of the context. Listeners may then use this model of a speaker as their likelihood function, to be combined with prior information about contextual salience as in Equation 1 in the main text.”
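As a numeric sanity check of Equation S4: with zero cost, the soft-max form reduces to normalized inverse word sizes, which is exactly the size principle. In the sketch below the referent is covered by two true words, one of size 2 and one of size 1; the helper names are my own.

```r
# Size principle (Eq. S4): P(w | r_s, C) is proportional to 1/|w|
# over the words true of r_s. Helper names are illustrative.
size_principle <- function(sizes) {
  inv <- 1 / sizes
  inv / sum(inv)
}

# soft-max form of S4: exp(-(-log(1/|w|))) just recovers 1/|w|
softmax_form <- function(sizes) {
  e <- exp(-(-log(1 / sizes)))
  e / sum(e)
}

sizes <- c(w = 2, w_prime = 1)  # |w| = 2, |w'| = 1
size_principle(sizes)
# w = 1/3, w_prime = 2/3: the smaller (more specific) word is preferred
```

Both forms give identical results, confirming that the exponentiated-surprisal derivation collapses to normalized inverse sizes when costs are zero and \(\alpha = 1\).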